AITopics | Loiret

Collaborating Authors

Loiret

PUe: Biased Positive-Unlabeled Learning Enhancement by Causal Inference

Neural Information Processing Systems, Feb-10-2026, 20:36:08 GMT

Positive-Unlabeled (PU) learning aims to achieve high-accuracy binary classification with limited labeled positive examples and numerous unlabeled ones. Existing cost-sensitive methods often rely on the strong assumption that examples with an observed positive label were selected entirely at random. In fact, uneven label distributions are prevalent in real-world PU problems, indicating that most actual positive and unlabeled data are subject to selection bias. In this paper, we propose a PU learning enhancement (PUe) algorithm based on causal inference theory, which employs normalized propensity scores and normalized inverse probability weighting (NIPW) techniques to reconstruct the loss function, thus obtaining a consistent, unbiased estimate of the classifier and enhancing the model's performance. Moreover, we investigate and propose a method for estimating propensity scores in deep learning using regularization techniques when the labeling mechanism is unknown. Our experiments on three benchmark datasets demonstrate that the proposed PUe algorithm significantly improves the accuracy of classifiers on datasets with non-uniform label distributions compared to advanced cost-sensitive PU methods.
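To make the weighting idea in the abstract above concrete, here is a minimal sketch of self-normalized inverse probability weighting applied to a PU risk. It is not the paper's loss: it plugs normalized weights into the standard unbiased PU risk from the cost-sensitive literature, and it assumes the class prior and the propensity scores of the labeled positives are already known. The function names (`normalized_ipw_weights`, `sigmoid_loss`, `weighted_pu_risk`) and the choice of a sigmoid surrogate loss are illustrative assumptions.

```python
import math


def normalized_ipw_weights(propensities):
    """Self-normalized inverse probability weights: w_i = (1/e_i) / sum_j (1/e_j)."""
    if any(not (0.0 < e <= 1.0) for e in propensities):
        raise ValueError("propensity scores must lie in (0, 1]")
    inverse = [1.0 / e for e in propensities]
    total = sum(inverse)
    return [v / total for v in inverse]


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss for a real-valued score and a label in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def weighted_pu_risk(pos_scores, pos_propensities, unl_scores, prior):
    """Unbiased-PU-style risk with normalized IPW on the labeled positives.

    Assumption (not from the source): the risk has the form
    prior * R_p(+1) + R_u(-1) - prior * R_p(-1), where the positive-sample
    averages are replaced by weighted sums using the normalized weights.
    """
    weights = normalized_ipw_weights(pos_propensities)
    risk_pos_as_pos = sum(w * sigmoid_loss(s, +1) for w, s in zip(weights, pos_scores))
    risk_pos_as_neg = sum(w * sigmoid_loss(s, -1) for w, s in zip(weights, pos_scores))
    risk_unl_as_neg = sum(sigmoid_loss(s, -1) for s in unl_scores) / len(unl_scores)
    return prior * risk_pos_as_pos + risk_unl_as_neg - prior * risk_pos_as_neg
```

With equal propensity scores the weights reduce to a plain average, so the sketch falls back to the unweighted risk; positives that were unlikely to be labeled receive proportionally larger weights.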

artificial intelligence, machine learning, propensity score, (20 more...)

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Europe > France > Centre-Val de Loire > Loiret > Orleans (0.04)

Genre: Research Report (0.68)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

3efb4bdc6bfe13e1ff95b4407c37961d-Paper-Conference.pdf

Neural Information Processing Systems, Oct-8-2025, 12:59:48 GMT

artificial intelligence, data mining, machine learning, (22 more...)

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Europe > France > Centre-Val de Loire > Loiret > Orleans (0.04)

Genre: Research Report (0.68)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Producer-Fairness in Sequential Bundle Recommendation

Rio, Alexandre, Soare, Marta, Amer-Yahia, Sihem

arXiv.org Artificial Intelligence, Jun-26-2025

We address fairness in the context of sequential bundle recommendation, where users are served in turn with sets of relevant and compatible items. Motivated by real-world scenarios, we formalize producer-fairness, which seeks to achieve a desired exposure of different item groups across users in a recommendation session. Our formulation combines naturally with building high-quality bundles. Our problem is solved in real time as users arrive. We propose an exact solution that caters to small instances of our problem. We then examine two heuristics, quality-first and fairness-first, and an adaptive variant that determines on the fly the right balance between bundle fairness and quality. Our experiments on three real-world datasets underscore the strengths and limitations of each solution and demonstrate their efficacy in providing fair bundle recommendations without compromising bundle quality.

artificial intelligence, machine learning, recommendation, (19 more...)

2506.20329

Country:

Asia > South Korea (0.14)
Asia > Japan (0.04)
Asia > China (0.04)
(4 more...)

Genre: Research Report (0.64)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.68)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

Connecting Voices: LoReSpeech as a Low-Resource Speech Parallel Corpus

Ouzerrout, Samy

arXiv.org Artificial Intelligence, Feb-25-2025

Aligned audio corpora are fundamental to NLP technologies such as ASR and speech translation, yet they remain scarce for underrepresented languages, hindering their technological integration. This paper introduces a methodology for constructing LoReSpeech, a low-resource speech-to-speech translation corpus. Our approach begins with LoReASR, a sub-corpus of short audio recordings aligned with their transcriptions, created through a collaborative platform. Building on LoReASR, long-form audio recordings, such as biblical texts, are aligned using tools such as the Montreal Forced Aligner (MFA). LoReSpeech delivers both intra- and inter-language alignments, enabling advancements in multilingual ASR systems, direct speech-to-speech translation models, and linguistic preservation efforts, while fostering digital inclusivity. This work is conducted within the Tutlayt AI project (https://tutlayt.fr).

corpora, translation, under-represented language, (14 more...)

2502.18215

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec > Montreal (0.06)
Europe > Middle East > Republic of Türkiye > Istanbul Province > Istanbul (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Media (0.36)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.78)

Memory-efficient Continual Learning with Neural Collapse Contrastive

Dang, Trung-Anh, Nguyen, Vincent, Vu, Ngoc-Son, Vrain, Christel

arXiv.org Artificial Intelligence, Dec-6-2024

Contrastive learning has significantly improved representation quality, enhancing knowledge transfer across tasks in continual learning (CL). However, catastrophic forgetting remains a key challenge, as contrastive-based methods primarily focus on "soft relationships" or "softness" between samples, which shift with changing data distributions and lead to representation overlap across tasks. Recently, the newly identified Neural Collapse phenomenon has shown promise in CL by focusing on "hard relationships" or "hardness" between samples and fixed prototypes. However, this approach overlooks "softness", which is crucial for capturing intra-class variability, and its rigid focus can also pull old class representations toward current ones, increasing forgetting. Building on these insights, we propose Focal Neural Collapse Contrastive (FNC^2), a novel representation learning loss that effectively balances both soft and hard relationships. Additionally, we introduce the Hardness-Softness Distillation (HSD) loss to progressively preserve the knowledge gained from these relationships across tasks. Our method outperforms state-of-the-art approaches, particularly in minimizing memory reliance. Remarkably, even without the use of memory, our approach rivals rehearsal-based methods, offering a compelling solution for data privacy concerns.

artificial intelligence, learning, machine learning, (16 more...)

2412.02865

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > France > Centre-Val de Loire > Loiret > Orleans (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Tree species classification at the pixel-level using deep learning and multispectral time series in an imbalanced context

Mouret, Florian, Morin, David, Planells, Milena, Vincent-Barbaroux, Cécile

arXiv.org Machine Learning, Nov-27-2024

This paper investigates tree species classification using Sentinel-2 multispectral satellite image time series. Despite their critical importance for many applications, tree species maps are often unavailable, outdated, or inaccurate for large areas. The value of using remote sensing time series to produce these maps has been highlighted in many studies. However, many methods proposed in the literature still rely on a standard classification algorithm, usually the Random Forest (RF) algorithm with vegetation indices. This study shows that the use of deep learning models can lead to a significant improvement in classification results, especially in an imbalanced context where the RF algorithm tends to favor the majority class. In our use case in the center of France with 10 tree species, we obtain an overall accuracy (OA) around 95% and an F1-macro score around 80% using three different benchmark deep learning architectures. In contrast, using the RF algorithm yields an OA of 93% and an F1 of 60%, indicating that the minority classes are not classified with sufficient accuracy. Therefore, the proposed framework is a strong baseline that can be easily implemented in most scenarios, even with a limited amount of reference data. Our results highlight that a standard multilayer perceptron can be competitive with batch normalization and a sufficient number of parameters. Other architectures (convolutional or attention-based) can also achieve strong results when tuned properly. Furthermore, our results show that DL models are naturally robust to imbalanced data, although similar results can be obtained using dedicated techniques.
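The gap between overall accuracy and macro-averaged F1 reported above is typical of imbalanced data. The following sketch, which uses a hypothetical two-class toy label set ("oak" and "pine") rather than the paper's data, shows how a classifier that always predicts the majority class can reach a high OA while its macro F1 stays low. The helper names `overall_accuracy` and `macro_f1` are illustrative.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the reference label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores, so every class counts equally."""
    classes = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy data: 9 majority samples, 1 minority sample,
# and a classifier that always predicts the majority class.
y_true = ["oak"] * 9 + ["pine"]
y_pred = ["oak"] * 10

print(overall_accuracy(y_true, y_pred))  # 0.9
print(round(macro_f1(y_true, y_pred), 4))  # 0.4737
```

Because the macro average gives the single missed minority class the same weight as the majority class, it exposes errors that overall accuracy hides.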

algorithm, classification, tree species, (15 more...)

2408.08887

Country:

Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Natural Language Querying System Through Entity Enrichment

Amavi, Joshua, Ferrari, Mirian Halfeld, Hiot, Nicolas

arXiv.org Artificial Intelligence, Oct-21-2024

This paper focuses on a domain expert querying system over databases. It presents a solution designed for a French enterprise interested in offering a natural language interface for its clients. The approach, based on entity enrichment, aims at translating natural language queries into database queries. In this paper, the database is treated through a logical paradigm, suggesting the adaptability of our approach to different database models. The good precision of our method is shown through some preliminary experiments.

artificial intelligence, natural language, text processing, (17 more...)

doi: 10.1007/978-3-030-55814-7

2410.15753

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > France > Centre-Val de Loire > Loiret > Orleans (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.69)

What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study

Savoldi, Beatrice, Papi, Sara, Negri, Matteo, Guerberof, Ana, Bentivogli, Luisa

arXiv.org Artificial Intelligence, Oct-7-2024

Gender bias in machine translation (MT) is recognized as an issue that can harm people and society. And yet, advancements in the field rarely involve people, the final MT users, or inform how they might be impacted by biased technologies. Current evaluations are often restricted to automatic methods, which offer an opaque estimate of what the downstream impact of gender disparities might be. We conduct an extensive human-centered study to examine whether, and to what extent, bias in MT causes harms with tangible costs, such as quality-of-service gaps between women and men. To this end, we collect behavioral data from 90 participants, who post-edited MT outputs to ensure correct gender translation. Across multiple datasets, languages, and types of users, our study shows that feminine post-editing demands significantly more technical and temporal effort, which also corresponds to higher financial costs. Existing bias measurements, however, fail to reflect the disparities found. Our findings advocate for human-centered approaches that can inform the societal impact of bias.

computational linguistic, proceedings, translation, (12 more...)

2410.00545

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > Finland > Pirkanmaa > Tampere (0.05)
Asia > Singapore (0.05)
(29 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education > Educational Setting (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Joint Channel Selection using FedDRL in V2X

Mancini, Lorenzo, Labbi, Safwan, Meraim, Karim Abed, Boukhalfa, Fouzi, Durmus, Alain, Mangold, Paul, Moulines, Eric

arXiv.org Artificial Intelligence, Oct-3-2024

Vehicle-to-everything (V2X) communication technology is revolutionizing transportation by enabling interactions between vehicles, devices, and infrastructures. This connectivity enhances road safety, transportation efficiency, and driver assistance systems. V2X benefits from Machine Learning, enabling real-time data analysis, better decision-making, and improved traffic predictions, making transportation safer and more efficient. In this paper, we study the problem of joint channel selection, where vehicles with different technologies choose one or more Access Points (APs) to transmit messages in a network. In this problem, vehicles must learn a strategy for channel selection, based on observations that incorporate vehicles' information (position and speed), network and communication data (Signal-to-Interference-plus-Noise Ratio from past communications), and environmental data (road type). We propose an approach based on Federated Deep Reinforcement Learning (FedDRL), which enables each vehicle to benefit from other vehicles' experiences. Specifically, we apply the federated Proximal Policy Optimization (FedPPO) algorithm to this task. We show that this method improves communication reliability while minimizing transmission costs and channel switches. The efficiency of the proposed solution is assessed via realistic simulations, highlighting the potential of FedDRL to advance V2X technology.
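The abstract above does not say how FedPPO combines the vehicles' models. As a rough illustration of how federated learning can let each vehicle benefit from the others' experience, the sketch below shows generic FedAvg-style weighted parameter averaging. This is an assumed aggregation rule, not the paper's method; `federated_average`, the flat parameter lists, and the use of sample counts as weights are all hypothetical.

```python
def federated_average(client_params, client_weights):
    """FedAvg-style aggregation: weighted mean of per-client parameter vectors.

    client_params: list of equal-length lists of floats, one per client
        (for example, flattened policy-network weights from each vehicle).
    client_weights: non-negative weights, e.g. local sample counts (assumed).
    """
    if not client_params or len(client_params) != len(client_weights):
        raise ValueError("need one weight per client and at least one client")
    dim = len(client_params[0])
    if any(len(p) != dim for p in client_params):
        raise ValueError("all clients must share the same parameter shape")
    total = sum(client_weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * p[k] for w, p in zip(client_weights, client_params)) / total
        for k in range(dim)
    ]


# Two hypothetical vehicles; the second contributed three times as much data.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_params)  # [2.5, 3.5]
```

In a full training loop, each client would run local policy updates between aggregation rounds and then reload the averaged parameters; those steps are omitted here.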

artificial intelligence, machine learning, reinforcement learning, (12 more...)

2410.20687

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry:

Transportation > Ground > Road (0.48)
Automobiles & Trucks (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)

Talking to Machines: do you read me?

Rojas-Barahona, Lina M.

arXiv.org Artificial Intelligence, Jul-2-2024

In this dissertation I would like to guide the reader through research on dialogue and, more precisely, the research I have conducted during my career since my PhD thesis, from modular architectures with machine learning/deep learning and reinforcement learning to end-to-end deep neural networks. Besides my work as a research associate, I also present the work I have supervised in recent years. I briefly review the state of the art and highlight the open research problems on conversational agents. Afterwards, I present my contribution to Task-Oriented Dialogues (TOD), both as a research associate and as the industrial supervisor of CIFRE theses. I discuss conversational QA. In particular, I present the work of two PhD candidates, Thibault Cordier and Sebastien Montella, as well as the work of the young researcher Quentin Brabant. Finally, I present the scientific project, where I discuss Large Language Models (LLMs) for Task-Oriented Dialogue and Multimodal Task-Oriented Dialogue.

language resource and evaluation, sentence and context representation, sentence representation, (15 more...)

2407.02354

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(35 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Health & Medicine > Therapeutic Area (0.92)
Government > Regional Government (0.67)
Education > Educational Setting > Online (0.45)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(8 more...)
